Nonparametric methods of constructing confidence regions for the
location vectors in the multivariate one-sample and two-sample problems
are provided. These methods are based on a class of rank order
statistics studied in detail by the authors in [8, 9, 11, 13].
Specifically, nonparametric confidence regions based on the Bonferroni
inequality, the maximum modulus, and Scheffe's method are studied. The
results obtained are nonparametric generalizations of some of the
results of Dunn [3,4] and Sidak [15]. Certain optimality properties
of the proposed methods are also established.

1.   INTRODUCTION 

The problem of constructing confidence regions for the vector of mean values
of a multivariate normal distribution has been considered by various workers. The
procedure most commonly employed is the one based on Hotelling's $T^2$-statistic.
However, since this procedure does not yield clear-cut confidence statements for
the individual mean values, statisticians are often led to the problem of finding
confidence regions based on individual coordinates. Work in this direction was
initiated by Dunn, who in a series of interesting papers [3,4,5] gave a number of
procedures for finding rectangular confidence regions. Recently, Sidak [15]
extended the work of Dunn and established some conservative properties of some of
her procedures in [3,4].

In nonparametric theory, attempts to meet the need for procedures relevant to
such problems have mostly been made in the univariate set-up (cf. [6,7,8,13,14]).
In the multivariate set-up, with the notable exception of Dunn [5], who considered
the problem in the bivariate case, nothing seems to have been done so far. The
object of the present paper is to develop some nonparametric confidence regions for
the vector of location parameters in the multivariate one-sample and two-sample
problems, and to study their properties. These confidence regions are based on a
class of rank order statistics studied in [2,9,11], and are multivariate
generalizations of the results in [13,14].

2.   CONFIDENCE  REGIONS  FOR  THE  MULTIVARIATE  ONE-SAMPLE  LOCATION  PROBLEM

Let $\mathbf{X}_\alpha = (X_{1\alpha},\dots,X_{p\alpha})'$, $\alpha=1,\dots,N$, be an
independent sample from an absolutely continuous cumulative distribution function
(cdf) $F_{\boldsymbol\theta}(\mathbf{x}) = F(\mathbf{x}-\boldsymbol\theta)$, where
$\mathbf{x}=(x_1,\dots,x_p)'$ and $\boldsymbol\theta = (\theta_1,\dots,\theta_p)'$.
$F(\mathbf{x})$ is assumed to be diagonally symmetric about $\mathbf{0}$, i.e., it
is assumed that the density function $f(\mathbf{x})$ of $F(\mathbf{x})$ is invariant
under simultaneous changes of signs of all its coordinates. The problem is to attach
a confidence region for $\boldsymbol\theta$.

For every univariate sample $\mathbf{X}^{(i)} = (X_{i1},\dots,X_{iN})$, consider the
rank order statistic

\[ T_{N,i}(\mathbf{X}^{(i)}) = \frac{1}{N}\sum_{\alpha=1}^{N} E^{(i)}_{N,\alpha} Z^{(i)}_{N,\alpha}, \quad i=1,\dots,p, \tag{2.1} \]

where $Z^{(i)}_{N,\alpha}$ is one or zero according as the $\alpha$-th smallest
observation among $|X_{i1}|,\dots,|X_{iN}|$ is from a positive $X$ or not, and
$E^{(i)}_{N,\alpha}$ is the expected value of the $\alpha$-th order statistic of a
sample of size $N$ from a distribution $\Psi_i^*(x)$, given by

\[ \Psi_i^*(x) = \Psi_i(x) - \Psi_i(-x), \ \text{if } x > 0, \text{ and is 0 otherwise}, \tag{2.2} \]

where $\Psi_i(x)$ satisfies the assumptions I, II and III of Puri and Sen [8]
($\alpha=1,\dots,N$; $i=1,\dots,p$). The statistics $T_{N,i}$, $i=1,\dots,p$, defined
above form a general class of statistics. Two important members of this class are
(i) the Wilcoxon signed rank statistic, obtained by taking for $\Psi_i(x)$ the
rectangular distribution over $(-1,1)$, and (ii) the normal scores statistic,
obtained by taking for $\Psi_i(x)$ the standard normal cdf.
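As an illustration of (2.1), the following is a minimal sketch for the Wilcoxon case only. It assumes that $\Psi_i$ is rectangular on $(-1,1)$, so that $\Psi_i^*$ is rectangular on $(0,1)$ and the scores are $E_{N,\alpha} = \alpha/(N+1)$; the function names are hypothetical and ties are assumed absent.

```python
def wilcoxon_scores(n):
    """Assumed scores E_{N,a} = a/(N+1): expected order statistics of Uniform(0,1)."""
    return [a / (n + 1) for a in range(1, n + 1)]


def signed_rank_statistic(x, scores=None):
    """T_N = (1/N) * sum_a E_{N,a} * Z_{N,a}, following the form of (2.1)."""
    n = len(x)
    if scores is None:
        scores = wilcoxon_scores(n)
    # Indices of the observations ordered by absolute value (no ties assumed).
    order = sorted(range(n), key=lambda k: abs(x[k]))
    # Z_{N,a} = 1 when the a-th smallest |x| belongs to a positive observation.
    return sum(scores[a] for a, k in enumerate(order) if x[k] > 0) / n


sample = [1.2, -0.4, 0.9, -2.0, 0.3]
t_value = signed_rank_statistic(sample)  # positive observations hold ranks 1, 3, 4
```

Other members of the class would be obtained by passing a different list of scores, for instance expected normal order statistics, which are not computed here.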

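2A.   Bonferroni Confidence Bounds for $\boldsymbol\theta$. Under the null hypothesis
$\boldsymbol\theta=\mathbf{0}$, the distribution of each $T_{N,i}(\mathbf{X}^{(i)})$ is
symmetric about $\tfrac12\bar{E}^{(i)}_N$, where
$\bar{E}^{(i)}_N = \sum_{\alpha=1}^{N} E^{(i)}_{N,\alpha}/N$.
Furthermore, in this case each $T_{N,i}(\mathbf{X}^{(i)})$ can have $2^N$ equally
likely realizations. Thus for each $T_{N,i}(\mathbf{X}^{(i)})$, it is possible to
select a constant $a^{(i)}_{N,\alpha^*}$ such that

\[ P\{|T_{N,i}(\mathbf{X}^{(i)}) - \tfrac12\bar{E}^{(i)}_N| \le a^{(i)}_{N,\alpha^*} \mid \boldsymbol\theta=\mathbf{0}\} \ge 1-\alpha^*, \quad i=1,\dots,p. \tag{2.3} \]

Thus, on setting $\alpha^*=\alpha/p$, where $1-\alpha$ is the desired confidence
coefficient for $\boldsymbol\theta$, we obtain by using the Bonferroni inequality,

\[ P\{|T_{N,i}(\mathbf{X}^{(i)}) - \tfrac12\bar{E}^{(i)}_N| \le a^{(i)}_{N,\alpha^*},\ i=1,\dots,p \mid \boldsymbol\theta=\mathbf{0}\} \ge 1-\alpha. \tag{2.4} \]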
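For small $N$ the constant in (2.3) can be found by direct enumeration of the $2^N$ equally likely sign patterns. The sketch below does this for Wilcoxon scores only, assuming $E_{N,\alpha}=\alpha/(N+1)$ so that $T_N = W/\{N(N+1)\}$ with $W$ the sum of the ranks carrying a positive sign; the function name is hypothetical, and in practice published tables would normally be used instead.

```python
import math
from itertools import product


def wilcoxon_critical_constant(n, alpha_star):
    """Smallest attainable a with P{|T_N - E_bar/2| <= a | theta = 0} >= 1 - alpha_star,
    for the assumed Wilcoxon scores, by enumerating all 2^N sign patterns."""
    center = n * (n + 1) / 4  # null mean of W, the sum of positively signed ranks
    devs = sorted(
        abs(sum(r for r, z in zip(range(1, n + 1), signs) if z) - center)
        for signs in product((0, 1), repeat=n)
    )
    # Index of the smallest deviation whose cumulative null probability is >= 1 - alpha_star.
    k = math.ceil((1 - alpha_star) * len(devs) - 1e-9) - 1
    # T_N = W / (N (N + 1)), so the deviation of W is rescaled accordingly.
    return devs[max(k, 0)] / (n * (n + 1))


# Bonferroni: with p coordinates and overall level alpha, use alpha_star = alpha / p.
a_const = wilcoxon_critical_constant(5, 0.25 / 2)
```

The enumeration costs $2^N$ evaluations, so it is only sensible for small samples, which is the range where the asymptotic constant is least reliable.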

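For small values of $N$ and standard $T_{N,i}$'s (such as Wilcoxon's signed rank
statistic), exact values of the $a^{(i)}_{N,\alpha^*}$'s may be obtained from the
existing tables. However, if $N$ is large, then since (cf. [11]) each $T_{N,i}$ has a
normal distribution in the limit, we find that asymptotically as $N\to\infty$

\[ N^{1/2} a^{(i)}_{N,\alpha^*} - \tfrac12\,\tau_{\alpha/2p}\,A_{N,i} \to 0, \tag{2.5} \]

where $\tau_\varepsilon$ is the upper $100\varepsilon\%$ point of the standard normal
distribution, and

\[ A^2_{N,i} = \frac{1}{N-1}\sum_{\alpha=1}^{N} \bigl(E^{(i)}_{N,\alpha}\bigr)^2, \quad i=1,\dots,p. \tag{2.6} \]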
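The large-sample approximation to the constant, $\tfrac12\tau_{\alpha/2p}A_{N,i}/N^{1/2}$, is a one-line computation. The sketch below takes $A_{N,i}$ as an input rather than recomputing it from the scores; the function name and argument names are hypothetical.

```python
import math
from statistics import NormalDist


def asymptotic_constant(a_ni, n, alpha, p):
    """Approximate a_{N,alpha*} by (1/2) * tau_{alpha/(2p)} * A_{N,i} / sqrt(N)."""
    # tau_eps is the upper 100*eps % point of the standard normal distribution.
    tau = NormalDist().inv_cdf(1 - alpha / (2 * p))
    return 0.5 * tau * a_ni / math.sqrt(n)


approx = asymptotic_constant(a_ni=1.0, n=100, alpha=0.05, p=1)
```

Increasing $p$ at fixed $\alpha$ increases $\tau_{\alpha/2p}$ and hence widens every coordinate interval, which is the price of the Bonferroni correction.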


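By definition, each $T_{N,i}(\mathbf{X}^{(i)} - t_i\mathbf{I}_N)$, where $\mathbf{I}_N$
is the vector of $N$ elements all equal to unity, is non-increasing in $t_i$,
$i=1,\dots,p$. Hence defining

\[ \hat\theta^{(i)}_{L,N} = \sup\{t_i:\ T_{N,i}(\mathbf{X}^{(i)} - t_i\mathbf{I}_N) > \tfrac12\bar{E}^{(i)}_N + a^{(i)}_{N,\alpha^*}\}, \tag{2.7} \]

\[ \hat\theta^{(i)}_{U,N} = \inf\{t_i:\ T_{N,i}(\mathbf{X}^{(i)} - t_i\mathbf{I}_N) < \tfrac12\bar{E}^{(i)}_N - a^{(i)}_{N,\alpha^*}\}, \tag{2.8} \]

and proceeding as in Sen [13], we find that

\[ P\{\hat\theta^{(i)}_{L,N} \le \theta_i \le \hat\theta^{(i)}_{U,N},\ i=1,\dots,p\} \ge 1-\alpha, \tag{2.9} \]

which gives the desired confidence region for $\boldsymbol\theta$ with confidence
coefficient greater than or equal to $1-\alpha$. It may be noted that (2.9) is a
translation invariant confidence region for $\boldsymbol\theta$, since
$\hat\theta^{(i)}_{L,N}$ and $\hat\theta^{(i)}_{U,N}$ are both translation invariant
(cf. [13]), $i=1,\dots,p$.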
2B.   Maximum Modulus Confidence Bounds for $\boldsymbol\theta$. Let us denote by
$R^{(i)}_\alpha$ the rank of $|X_{i\alpha}|$ among $|X_{i1}|,\dots,|X_{iN}|$, for
$\alpha=1,\dots,N$, $i=1,\dots,p$. Also let $C^{(i)}_\alpha$ be 1 or $-1$ according as
$X_{i\alpha}$ is positive or negative, for $\alpha=1,\dots,N$, $i=1,\dots,p$. Let then

\[ v_{N,ij} = \frac{1}{N}\sum_{\alpha=1}^{N} E^{(i)}_{N,R^{(i)}_\alpha}\, E^{(j)}_{N,R^{(j)}_\alpha}\, C^{(i)}_\alpha C^{(j)}_\alpha, \quad i,j=1,\dots,p. \tag{2.10} \]
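The matrix of (2.10) is built from signed scores, one per observation and coordinate. A minimal sketch follows, under two assumptions that are not confirmed by the text: Wilcoxon scores $E_{N,\alpha}=\alpha/(N+1)$ are used for every coordinate, and the sum is normalised by $1/N$. The function name is hypothetical.

```python
def sign_rank_covariance(xs):
    """xs[i] is the i-th coordinate sample (length N). Returns the p x p matrix of
    averaged products of signed scores; 1/N normalisation is an assumption."""
    n = len(xs[0])
    signed = []
    for x in xs:
        # Rank of |x_a| among the absolute values of this coordinate (no ties assumed).
        order = sorted(range(n), key=lambda a: abs(x[a]))
        rank = [0] * n
        for r, a in enumerate(order, start=1):
            rank[a] = r
        # Signed score: E_{N,R_a} * C_a with C_a = +1 or -1.
        signed.append([(rank[a] / (n + 1)) * (1 if x[a] > 0 else -1) for a in range(n)])
    return [[sum(si[a] * sj[a] for a in range(n)) / n for sj in signed] for si in signed]


v = sign_rank_covariance([[1.0, -2.0, 3.0], [-1.0, 2.0, -3.0]])
```

In this toy input the second coordinate is the negative of the first, so the off-diagonal entry is minus the diagonal entry.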


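Note that $v_{N,ii} = \{(N-1)/N\}A^2_{N,i}$, where $A_{N,i}$ is defined by (2.6). We
denote $\mathbf{V}_N = ((v_{N,ij}))$ and its reciprocal by
$\mathbf{V}_N^{-1} = ((v_N^{ij}))$. From the results of [11], it follows that when
$\boldsymbol\theta=\mathbf{0}$, the permutation (conditional) distribution of

\[ S_N = 4N\sum_{i=1}^{p}\sum_{j=1}^{p} v_N^{ij}\,[T_{N,i} - \tfrac12\bar{E}^{(i)}_N][T_{N,j} - \tfrac12\bar{E}^{(j)}_N] \tag{2.11} \]

over the $2^N$ (conditionally) equally likely realizations
$\{((-1)^{i_1}\mathbf{X}_1,\dots,(-1)^{i_N}\mathbf{X}_N),\ i_j=0,1;\ j=1,\dots,N\}$, is
asymptotically chi-square (central) with $p$ degrees of freedom. Also, applying some
elementary results on matrix algebra, we obtain that

\[ S_N \ge 4N \max_{1\le i\le p}\, [T_{N,i} - \tfrac12\bar{E}^{(i)}_N]^2 / v_{N,ii}. \tag{2.12} \]

Hence, we obtain that for $\boldsymbol\theta=\mathbf{0}$ and large $N$,

\[ P\{|T_{N,i} - \tfrac12\bar{E}^{(i)}_N| \le \tfrac12 A_{N,i}\,\chi_{p,\alpha}\sqrt{N-1}/N,\ i=1,\dots,p\} \ge 1-\alpha, \tag{2.13} \]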

where $\chi^2_{p,\alpha}$ is the upper $100\alpha\%$ point of the chi-square
distribution with $p$ degrees of freedom. As (2.13) is analogous to (2.4), we may
proceed as in (2.7) to (2.9) with $a^{(i)}_{N,\alpha^*}$ replaced by
$\tfrac12 A_{N,i}\,\chi_{p,\alpha}\sqrt{N-1}/N$, $i=1,\dots,p$, and obtain an analogous
confidence set for $\boldsymbol\theta$. It follows from (2.5) and (2.13) that the
Bonferroni bound in (2.9) will be asymptotically shorter than the corresponding bound
obtained by the above maximum modulus method if $\tau_{\alpha/2p} \le \chi_{p,\alpha}$,
and this may be easily verified to be true for the values of $\alpha$ in common use
by a look at any standard statistical table, such as the Biometrika tables for
statisticians. Besides this advantage, the Bonferroni bound can also be obtained for
small values of $N$ (by using existing tables for $a^{(i)}_{N,\alpha^*}$ in (2.4)),
whereas (2.13) is essentially a large sample expression.
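The comparison of $\tau_{\alpha/2p}$ with $\chi_{p,\alpha}$ can be checked without tables when $p=2$, because the upper $\alpha$ point of the chi-square distribution with two degrees of freedom has the closed form $-2\ln\alpha$. The sketch below covers only that case; the function name is hypothetical. It suggests that the inequality holds at the conventional levels but not for $\alpha$ close to 1 (at $\alpha=0.9$ the normal point is the larger of the two).

```python
import math
from statistics import NormalDist


def bonferroni_vs_max_modulus_p2(alpha):
    """Return (tau_{alpha/4}, chi_{2,alpha}) for p = 2."""
    tau = NormalDist().inv_cdf(1 - alpha / 4)  # upper alpha/(2p) normal point with p = 2
    chi = math.sqrt(-2 * math.log(alpha))      # square root of the upper alpha point of chi-square(2)
    return tau, chi


for level in (0.01, 0.05, 0.10):
    tau, chi = bonferroni_vs_max_modulus_p2(level)
    print(level, round(tau, 3), round(chi, 3), tau <= chi)
```

For other values of $p$ a chi-square quantile routine or a table would be needed.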

It is worth mentioning that for the special case of Wilcoxon's signed rank
statistic, (2.7) - (2.9) can be considerably simplified as follows. Let
$Z^{(i)}_{(1)} \le \dots \le Z^{(i)}_{(M)}$ be the $M = N(N+1)/2$ ordered values of
$(X_{i\alpha}+X_{i\beta})/2$, $1\le\alpha\le\beta\le N$. Then, using the alternative
form of the signed rank statistic by Tukey [16], it is easily seen that (2.9) is
equivalent to a statement of the form

\[ P\{Z^{(i)}_{(k_i)} \le \theta_i \le Z^{(i)}_{(M-k_i+1)},\ i=1,\dots,p\} \ge 1-\alpha, \tag{2.14} \]

where the integer $k_i$ is determined by $a^{(i)}_{N,\alpha^*}$. (For a numerical
example, see [8], and for a graphical procedure, see [7].) The confidence region
corresponding to (2.13) may be obtained in exactly the same way. In general, for
$\Psi_i$ different from a rectangular cdf, the computation of (2.9) involves a trial
and error procedure (cf. [8]).
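For the Wilcoxon case the interval end points are order statistics of the pairwise averages, so no trial and error is needed. The following sketch computes them for one coordinate; the function names are hypothetical, and the index `k` is treated as a given input, since the rule converting the critical constant into `k` is not spelled out here.

```python
def walsh_averages(x):
    """Sorted averages (x_a + x_b)/2 over all pairs a <= b; there are N(N+1)/2 of them."""
    n = len(x)
    return sorted((x[a] + x[b]) / 2 for a in range(n) for b in range(a, n))


def signed_rank_interval(x, k):
    """Interval [Z_(k), Z_(M-k+1)] (1-indexed order statistics) for one coordinate."""
    z = walsh_averages(x)
    m = len(z)
    return z[k - 1], z[m - k]


lower, upper = signed_rank_interval([1.0, 2.0, 4.0], k=2)
```

A rectangular region for the whole vector would apply this to each coordinate separately, with `k` chosen at the Bonferroni level $\alpha/p$.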

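2C.   Asymptotically Scheffe's Bounds for $\boldsymbol\theta$. Let us denote by
$F_{[i]}(x)$ the marginal cdf of $X_{i\alpha}$ and by
$J^*_{(i)}(u) = \Psi_i^{*-1}(u)$: $0 < u < 1$, $i=1,\dots,p$. Also let

\[ A_i^2 = \lim_{N\to\infty} A^2_{N,i} = \int_0^1 J^{*2}_{(i)}(u)\,du, \quad i=1,\dots,p, \tag{2.15} \]

\[ B_i = \int_{-\infty}^{\infty} \frac{d}{dx} J_{(i)}[F_{[i]}(x)]\,dF_{[i]}(x); \quad J_{(i)}(u) = \Psi_i^{-1}(u), \quad i=1,\dots,p. \tag{2.16} \]

Then from the results of Sen [13],

\[ \hat{B}_i = (2\tau_{\alpha^*/2}A_i)\big/ N^{1/2}(\hat\theta^{(i)}_{U,N} - \hat\theta^{(i)}_{L,N}), \quad i=1,\dots,p \tag{2.17} \]

(where $\hat\theta^{(i)}_{L,N}$ and $\hat\theta^{(i)}_{U,N}$ are defined in (2.7) and
(2.8) respectively) is a consistent estimator of $B_i$ for each $i=1,\dots,p$, no
matter whether $\boldsymbol\theta$ is $\mathbf{0}$ or not. Next, let $R^*_{i\alpha}$ be
the rank of $X_{i\alpha}$ among $(X_{i1},\dots,X_{iN})$. Then, from [8],

\[ \hat{\mathbf{V}} = ((\hat{v}_{ij})), \quad \text{where } \hat{v}_{ij} = \frac{1}{N}\sum_{\alpha=1}^{N} E^{(i)}_{N,R^*_{i\alpha}}\, E^{(j)}_{N,R^*_{j\alpha}}, \tag{2.18} \]

is a consistent and translation-invariant estimator of the covariance matrix
$\boldsymbol\nu$. Here $E_{N,\alpha}$ is the expected value of the $\alpha$-th order
statistic of a sample of size $N$ from a distribution $\Psi^*(x)$. Hence, from the
results in [11] it follows that

\[ H_N = 4N\sum_{i=1}^{p}\sum_{j=1}^{p} \hat{v}^{ij}\,[T_{N,i} - \tfrac12\bar{E}^{(i)}_N][T_{N,j} - \tfrac12\bar{E}^{(j)}_N], \tag{2.19} \]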
1=1  j=l 
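where $((\hat{v}_{ij}))^{-1} = ((\hat{v}^{ij}))$, has asymptotically the chi-square
distribution with $p$ degrees of freedom. Now let us denote

\[ \hat\theta^{(i)}_1 = \sup\{t_i:\ T_{N,i}(\mathbf{X}^{(i)} - t_i\mathbf{I}_N) > \tfrac12\bar{E}^{(i)}_N\}, \tag{2.20} \]

\[ \hat\theta^{(i)}_2 = \inf\{t_i:\ T_{N,i}(\mathbf{X}^{(i)} - t_i\mathbf{I}_N) < \tfrac12\bar{E}^{(i)}_N\}, \quad i=1,\dots,p, \]

and

\[ \hat\theta_i = (\hat\theta^{(i)}_1 + \hat\theta^{(i)}_2)/2, \quad i=1,\dots,p. \tag{2.21} \]

Then, it is easy to verify (along the lines of [13]) that

\[ N\sum_{i=1}^{p}\sum_{j=1}^{p} \hat{v}^{ij}(\hat\theta_i-\theta_i)(\hat\theta_j-\theta_j)\hat{B}_i\hat{B}_j \tag{2.22} \]

has asymptotically the chi-square distribution with $p$ degrees of freedom. Thus

\[ P\Bigl\{N\sum_{i=1}^{p}\sum_{j=1}^{p} \hat{v}^{ij}(\hat\theta_i-\theta_i)(\hat\theta_j-\theta_j)\hat{B}_i\hat{B}_j \le \chi^2_{p,\alpha}\Bigr\} \simeq 1-\alpha. \tag{2.23} \]

This gives a Scheffe-type confidence region for $\boldsymbol\theta$.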
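Once the point estimates, the $\hat{B}_i$ and the inverse of $\hat{\mathbf{V}}$ are available, checking whether a candidate $\boldsymbol\theta$ lies in the region (2.23) is a single quadratic-form evaluation. The sketch below assumes all of these quantities are supplied by the caller, including the chi-square critical value; the function and argument names are hypothetical.

```python
def in_scheffe_region(theta, theta_hat, b_hat, v_inv, n, chi2_crit):
    """True when N * sum_ij v^{ij} (th_i - t_i)(th_j - t_j) B_i B_j <= chi2_crit."""
    p = len(theta)
    # Scaled deviations (theta_hat_i - theta_i) * B_hat_i.
    d = [(theta_hat[i] - theta[i]) * b_hat[i] for i in range(p)]
    q = n * sum(v_inv[i][j] * d[i] * d[j] for i in range(p) for j in range(p))
    return q <= chi2_crit


identity = [[1.0, 0.0], [0.0, 1.0]]
# 5.991 is the upper 5% point of chi-square with 2 degrees of freedom.
inside = in_scheffe_region([0.1, 0.1], [0.0, 0.0], [1.0, 1.0], identity, 100, 5.991)
```

The region is an ellipsoid centred at the point estimate, in contrast with the rectangular regions of Sections 2A and 2B.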

The asymptotic relative efficiency of the Scheffe-type confidence region defined
in (2.23) with respect to the one based on Hotelling's $T^2$-statistic, as measured
by the inverse ratio of the volumes of the two confidence regions raised to the
power $1/p$ (cf. Wilks [18], p. 385), is the same as the asymptotic relative
efficiency of the point estimator $\hat{\boldsymbol\theta} = (\hat\theta_1,\dots,\hat\theta_p)$
with respect to the sample mean vector $\bar{\mathbf{X}} = (\bar{X}_1,\dots,\bar{X}_p)$
raised to the power $1/p$. Since the latter is studied in detail in Puri and Sen
[10], the details of the asymptotic relative efficiencies of the confidence regions
are omitted. However, it is worth mentioning that the asymptotic relative
efficiency of the Scheffe-type confidence region based on the normal scores
statistic with respect to the one based on Hotelling's $T^2$-statistic is 1 when
the underlying distribution is non-singular $p$-variate normal.

It is also interesting to compare the Bonferroni bounds derived in Section 2A
with the Dunn-Sidak bounds. Using the results of Sen [13], it can be easily shown
that for the confidence limits in (2.9)

\[ N^{1/2}(\hat\theta^{(i)}_{U,N} - \hat\theta^{(i)}_{L,N}) \to 2\tau_{\alpha^*/2}A_i/B_i, \quad i=1,\dots,p, \tag{2.24} \]

in probability, where $A_i$ and $B_i$ are defined by (2.15) and (2.16) respectively.
It also follows from the results of Sidak [15] that $N^{1/2}$ times the length of the
Dunn-Sidak confidence limits tends to $2\tau_{\alpha^*/2}\sigma_i$, where $\sigma_i^2$
is the variance of $X_{i\alpha}$, $i=1,\dots,p$ (where, of course, the small difference
between the two values of $\alpha^*$ is neglected). Thus, according to the same
criterion of asymptotic relative efficiency as mentioned earlier, the asymptotic
relative efficiency of the proposed Bonferroni bounds with respect to the Dunn-Sidak
bounds is equal to

\[ \Bigl(\prod_{j=1}^{p} \sigma_j^2 B_j^2/A_j^2\Bigr)^{1/p}. \tag{2.25} \]

Bounds for (2.25) can be easily deduced from well-known bounds available in the
literature. For example, if we use Wilcoxon scores, then (2.25) is bounded below
by 0.864, is $3/\pi$ for normal $F(\mathbf{x})$, and is greater than unity for many
non-normal cdf's. Again, for normal scores, $\sigma_j^2B_j^2/A_j^2 \ge 1$, where the
equality sign holds only when the parent cdf is normal. Thus, (2.25) for normal
scores is bounded below by unity for all $F(\mathbf{x})$. Consequently, the Bonferroni
bounds based on coordinate-wise normal scores statistics are asymptotically at least
as efficient as the Dunn-Sidak bounds.
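The value $3/\pi$ quoted for Wilcoxon scores under normality can be checked numerically. With $\Psi_i$ rectangular on $(-1,1)$, the definitions (2.15) and (2.16) give $A_i^2 = 1/3$ and $B_i = 2\int f^2(x)\,dx$, so each factor of (2.25) equals $12\sigma^2(\int f^2)^2$. The sketch below evaluates this by a midpoint rule; the function names, integration limits and step count are illustrative choices.

```python
import math


def wilcoxon_efficiency_factor(pdf, sigma, lo, hi, steps=20000):
    """sigma^2 B^2 / A^2 for Wilcoxon scores, i.e. 12 * sigma^2 * (integral of pdf^2)^2."""
    h = (hi - lo) / steps
    # Midpoint rule for the integral of pdf(x)^2 over [lo, hi].
    integral = sum(pdf(lo + (k + 0.5) * h) ** 2 for k in range(steps)) * h
    return 12 * sigma ** 2 * integral ** 2


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


value = wilcoxon_efficiency_factor(normal_pdf, sigma=1.0, lo=-10.0, hi=10.0)
print(value, 3 / math.pi)
```

When all $p$ marginals are normal, every factor in the product equals this value, so (2.25) itself reduces to $3/\pi$.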

3.   MULTIVARIATE  TWO  SAMPLE  LOCATION  PROBLEM 

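Let $\mathbf{X}_\alpha = (X_{1\alpha},\dots,X_{p\alpha})'$, $\alpha=1,\dots,m$, and
$\mathbf{Y}_\beta = (Y_{1\beta},\dots,Y_{p\beta})'$, $\beta=1,\dots,n$, be independent
samples from the $p$-variate absolutely continuous cdf $F(\mathbf{x})$ and
$F(\mathbf{x}-\boldsymbol\Delta)$ respectively, and let $N=m+n$. The problem is to
obtain the confidence region for $\boldsymbol\Delta = (\Delta_1,\dots,\Delta_p)$.

As before we work with the sequences of rank functions. Let $R_{i\alpha}$ and
$R_{i\beta}$ be the ranks of $X_{i\alpha}$ and $Y_{i\beta}$ respectively when the
observations corresponding to the $i$-th variate of both the samples, that is,
$(X_{i1},\dots,X_{im},Y_{i1},\dots,Y_{in})$, are arranged in ascending order of
magnitude. Consider now the following rank order statistic based on
$\mathbf{X}^{(i)} = (X_{i1},\dots,X_{im})$ and $\mathbf{Y}^{(i)} = (Y_{i1},\dots,Y_{in})$,

\[ T_{N,i}(\mathbf{X}^{(i)},\mathbf{Y}^{(i)}) = \frac{1}{m}\sum_{\alpha=1}^{m} E^{(i)}_{N,R_{i\alpha}}, \quad i=1,\dots,p, \tag{3.1} \]

and denote

\[ \bar{E}^{(i)}_N = \frac{1}{N}\sum_{\alpha=1}^{N} E^{(i)}_{N,\alpha}, \qquad A^2_{N,i} = \frac{1}{N}\sum_{\alpha=1}^{N} \bigl(E^{(i)}_{N,\alpha}\bigr)^2 - \bigl(\bar{E}^{(i)}_N\bigr)^2, \tag{3.2} \]

where we assume that $E^{(i)}_{N,\alpha}$ is the expected value of the $\alpha$-th
order statistic of a sample of size $N$ from some distribution $\Psi_i(x)$ which
satisfies the assumptions of Section 2.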

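3A.   Bonferroni Confidence Bounds for $\boldsymbol\Delta$. Under the null hypothesis
$\boldsymbol\Delta=\mathbf{0}$, each $T_{N,i}$ can have $N!$ equally likely
realizations obtained by all possible permutations of the ranks of
$(X_{i1},\dots,X_{im},Y_{i1},\dots,Y_{in})$ over $1,\dots,N$. Thus, it is possible to
select two values of $T_{N,i}$, say, $a^{(i)}_N$ and $b^{(i)}_N$, such that for each
$i=1,\dots,p$

\[ P\{a^{(i)}_N \le T_{N,i}(\mathbf{X}^{(i)},\mathbf{Y}^{(i)}) \le b^{(i)}_N \mid \boldsymbol\Delta=\mathbf{0}\} \ge 1-\alpha^*. \tag{3.3} \]

Thus, on setting $\alpha^*=\alpha/p$, we obtain by using the Bonferroni inequality
that, corresponding to the confidence coefficient $1-\alpha$,

\[ P\{a^{(i)}_N \le T_{N,i}(\mathbf{X}^{(i)},\mathbf{Y}^{(i)}) \le b^{(i)}_N,\ i=1,\dots,p \mid \boldsymbol\Delta=\mathbf{0}\} \ge 1-\alpha. \tag{3.4} \]

For small values of $N$, and specific $T_{N,i}$'s such as the Wilcoxon two-sample
statistic, $a^{(i)}_N$ and $b^{(i)}_N$ may be obtained from the existing tables.
However, if $N$ is large, then since [9] $\mathbf{T}_N = (T_{N,1},\dots,T_{N,p})$ has a
$p$-variate normal distribution in the limit, we have asymptotically

\[ |a^{(i)}_N - \bar{E}^{(i)}_N + \tau_{\alpha/2p}\,A_{N,i}\,(n/mN)^{1/2}| = o_p(1) \tag{3.5} \]

and

\[ |b^{(i)}_N - \bar{E}^{(i)}_N - \tau_{\alpha/2p}\,A_{N,i}\,(n/mN)^{1/2}| = o_p(1), \tag{3.6} \]

where $A_{N,i}$ is defined in (3.2), and $\tau_{\alpha/2p}$ is defined below (2.5).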

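Now, by definition, for each $i=1,\dots,p$,
$T_{N,i}(\mathbf{X}^{(i)} - t_i\mathbf{I}_m, \mathbf{Y}^{(i)})$ is nonincreasing in
$t_i$. Hence, defining the limits $\hat\Delta_{iL}$ and $\hat\Delta_{iU}$,
$i=1,\dots,p$, from the critical values $a^{(i)}_N$ and $b^{(i)}_N$ in the same manner
as in (2.7) and (2.8) (3.7), and proceeding as in Sen [13], it follows that

\[ P\{\hat\Delta_{iL} \le \Delta_i \le \hat\Delta_{iU},\ i=1,\dots,p\} \ge 1-\alpha, \tag{3.8} \]

which gives the desired confidence region for $\boldsymbol\Delta$. It may be noted
that $\hat\Delta_{iL}$ and $\hat\Delta_{iU}$ are both translation invariant estimates
of $\Delta_i$ (cf. [13]). For the Wilcoxon statistic the confidence regions may be
obtained as in Section 2B; see also [6] and [7].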
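For the Wilcoxon two-sample statistic, the analogue of the pairwise averages of Section 2B is the set of $mn$ pairwise differences between the two samples. The sketch below computes an interval for one coordinate of the shift of the second sample relative to the first; the function name is hypothetical and the index `k` is a given input, to be chosen from tables of the two-sample statistic at the Bonferroni level.

```python
def shift_interval(x, y, k):
    """Interval [D_(k), D_(mn-k+1)] from the ordered differences y_j - x_i (1-indexed)."""
    d = sorted(yj - xi for xi in x for yj in y)
    return d[k - 1], d[len(d) - k]


low, high = shift_interval([0.0, 1.0], [2.0, 4.0], k=2)
```

Because only differences enter, adding a common constant to both samples leaves the interval unchanged.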

Since the maximum modulus confidence bounds for $\boldsymbol\Delta$ are subject to the
same criticisms as for $\boldsymbol\theta$, the discussion of such bounds is omitted
from this section. Finally, analogous to Section 2C, we have the following.

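3B.   Asymptotically Scheffe Bounds for $\boldsymbol\Delta$. Let us denote by
$F_{[i]}(x)$ the marginal cdf of the $i$-th variate of the cdf $F(\mathbf{x})$, and by
$J_{(i)}(u) = \Psi_i^{-1}(u)$, $i=1,\dots,p$. Also let

\[ A_i^2 = \lim_{N\to\infty} A^2_{N,i} = \int_0^1 J^2_{(i)}(u)\,du - \Bigl(\int_0^1 J_{(i)}(u)\,du\Bigr)^2, \quad i=1,\dots,p, \tag{3.9} \]

\[ B_i = \int_{-\infty}^{\infty} \frac{d}{dx} J_{(i)}[F_{[i]}(x)]\,dF_{[i]}(x), \quad i=1,\dots,p. \tag{3.10} \]

Then, from Sen [13], irrespective of whether $\boldsymbol\Delta=\mathbf{0}$ or not,

\[ \hat{B}_i = (N/mn)^{1/2}(2A_i\tau_{\alpha/2p})\big/(\hat\Delta_{iU} - \hat\Delta_{iL}) \tag{3.11} \]

is a consistent estimator of $B_i$ for all $i=1,\dots,p$.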

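Let now $R^{*(1)}_{i\alpha}$ be the rank of $X_{i\alpha}$ among $(X_{i1},\dots,X_{im})$,
$\alpha=1,\dots,m$, and $R^{*(2)}_{i\beta}$ be the rank of $Y_{i\beta}$ among
$(Y_{i1},\dots,Y_{in})$, $\beta=1,\dots,n$, for each $i=1,\dots,p$. Denote

\[ \hat{v}^{(1)}_{ij} = \frac{1}{m}\sum_{\alpha=1}^{m} E^{(i)}_{m,R^{*(1)}_{i\alpha}} E^{(j)}_{m,R^{*(1)}_{j\alpha}} - \Bigl(\frac{1}{m}\sum_{\alpha=1}^{m} E^{(i)}_{m,\alpha}\Bigr)\Bigl(\frac{1}{m}\sum_{\alpha=1}^{m} E^{(j)}_{m,\alpha}\Bigr), \tag{3.12} \]

\[ \hat{v}^{(2)}_{ij} = \frac{1}{n}\sum_{\beta=1}^{n} E^{(i)}_{n,R^{*(2)}_{i\beta}} E^{(j)}_{n,R^{*(2)}_{j\beta}} - \Bigl(\frac{1}{n}\sum_{\beta=1}^{n} E^{(i)}_{n,\beta}\Bigr)\Bigl(\frac{1}{n}\sum_{\beta=1}^{n} E^{(j)}_{n,\beta}\Bigr), \tag{3.13} \]

\[ \hat{v}_{ij} = (m\hat{v}^{(1)}_{ij} + n\hat{v}^{(2)}_{ij})/N, \quad i,j=1,\dots,p, \tag{3.14} \]

\[ \hat{\mathbf{V}} = ((\hat{v}_{ij})), \quad \hat{\mathbf{V}}^{-1} = ((\hat{v}^{ij})). \tag{3.15} \]

Then, since [9] $N^{1/2}(\mathbf{T}_N - \bar{\mathbf{E}}_N)$ (where
$\bar{\mathbf{E}}_N = (\bar{E}^{(1)}_N,\dots,\bar{E}^{(p)}_N)$) has a multinormal
distribution in the limit, it follows that under $\boldsymbol\Delta=\mathbf{0}$, the
corresponding quadratic form in $\mathbf{T}_N - \bar{\mathbf{E}}_N$ based on
$\hat{\mathbf{V}}^{-1}$ (3.16) has asymptotically the chi-square distribution with $p$
degrees of freedom. Let us now define the point estimators $\hat\Delta_i$,
$i=1,\dots,p$, in the same manner as $\hat\theta_i$ in (2.20) and (2.21) (3.17).
Then, proceeding as in Sen [13],

\[ \frac{mn}{N}\sum_{i=1}^{p}\sum_{j=1}^{p} \hat{v}^{ij}(\hat\Delta_i-\Delta_i)(\hat\Delta_j-\Delta_j)\hat{B}_i\hat{B}_j \tag{3.18} \]

has asymptotically the chi-square distribution with $p$ degrees of freedom.
Consequently

\[ P\Bigl\{\frac{mn}{N}\sum_{i=1}^{p}\sum_{j=1}^{p} \hat{v}^{ij}\hat{B}_i\hat{B}_j(\hat\Delta_i-\Delta_i)(\hat\Delta_j-\Delta_j) \le \chi^2_{p,\alpha}\Bigr\} \simeq 1-\alpha, \tag{3.19} \]

which is the desired confidence bound for $\boldsymbol\Delta$.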

The asymptotic efficiency of the Scheffe-type confidence region defined in
(3.19) with respect to the one based on Hotelling's $T^2$-statistic is the same as
that of the corresponding point estimate
$\hat{\boldsymbol\Delta} = (\hat\Delta_1,\dots,\hat\Delta_p)$ with respect to the
sample mean vector $\bar{\mathbf{Y}}-\bar{\mathbf{X}}$, the details of which are
discussed in [10]. As in the one-sample problem, here too, the confidence regions
based on the Bonferroni inequality using the normal scores statistic are
asymptotically at least as efficient as the Dunn-Sidak confidence regions.